A Small World Network of Prime Numbers 
According to the Goldbach conjecture, any even number can be written as the sum of two prime
numbers: n = p + q. We construct a network where each node is a prime number and, corresponding
to every even number n, we put a link between the component primes p and q. In most cases an even
number can be broken up in many ways, and we then choose one decomposition with a probability
proportional to $|p - q|^{\alpha}$. Through computation of the average shortest distance and the
clustering coefficient, we conclude that for $\alpha > -1.8$ the network is of small-world type and
for $\alpha < -1.8$ it is of regular type. We also present a theoretical justification for such behaviour.



I. INTRODUCTION 



The study of networks, with an emphasis on small-world behaviour and
scale-invariant properties, has turned out to be very important for
analysing the statistical properties of diverse types of systems [1]. A
network is defined as a graph consisting of some "nodes" and some "links"
(or edges). When every pair of nodes is connected by a link, the network
becomes a trivial one. One therefore links (or does not link) two nodes
according to some intrinsic property of theirs. Depending on the context,
different properties of the nodes are relevant in different networks,
leading to different rules for linking. For example, in a science
collaboration network, each scientist is a node and two nodes (scientists)
are linked when they are co-authors of at least one paper. In an English
language network, each word is a node and a pair of words is linked if (in
one or more sentences) they appear side by side or one word apart. In this
way, a network structure can be identified in widely varied contexts. Once
a network is identified, one can measure in it some characteristic
properties like the average shortest distance, the clustering coefficient,
the degree distribution, etc. When the degree distribution decays as a
power law, the network is said to be scale-free. When the average shortest
distance is small (increases only logarithmically with the size of the
network) but the clustering coefficient is high compared to that of a
random network, the network is called small-world. Many natural and
man-made networks [1] have been found to be scale-free and/or small-world.

Recently, Corso [2] has considered a network where each node is a natural
number and two nodes are linked if they share a common prime factor larger
than some chosen lower limit. The network is not scale-free (except in some
restricted sense) but is small-world unless the lower limit is 1, that is,
unless one considers all prime factors. Motivated by this study, we
consider here a network where each node is a prime number. The rule for
placing links will be explained in the next section. The rule relies upon
the validity of what is known as the Goldbach conjecture [3] and involves a
tunable parameter. Depending on the value of the parameter, we have a
small-world or a regular network. We mention that our work has no
connection with the issue of the validity of the Goldbach conjecture.

In the next section, we shall describe the network and 
present a theoretical analysis of its behaviour. In Section 
III we shall describe the computational studies and in 
Section IV present the general conclusions. 



II. THE MODEL 

The Goldbach conjecture says that any even number (> 2) can be written as
the sum of two prime numbers (often in more than one way). To construct our
network, we start with the even number 8 and note that it can be broken up
into primes as 8 = 3 + 5. (To avoid uninteresting complications we do not
consider even numbers below 8.) We put the first two nodes in the network,
label them 3 and 5, and put a link between them. Now we consider all even
numbers 10, 12, 14, ... up to some $N_e$ and break up each of them into
primes as n = p + q. If p is not already in the network, a new node
labelled p is added, and similarly for q. Then a link is put between p and
q. A complication is that (as mentioned earlier) very often one even number
can be broken up into primes in more than one way. If one puts links
corresponding to all the decompositions, then one has a link between almost
every pair of nodes and the network becomes trivial. Therefore, with the
following prescription, we choose one decomposition for every even number,
depending on the difference between the component primes. We calculate the
difference $\Delta = |p - q|$ between the two components for every break-up
and choose one break-up with probability proportional to $\Delta^{\alpha}$,
where $\alpha$ is a (in fact, the only) parameter of the model. For
example, the number n = 24 can be broken up in three ways: 5 + 19, 7 + 17
and 11 + 13, with $\Delta$ = 14, 10 and 2 respectively. In a large number
of realisations, the link for 24 will be between 5 and 19 with probability
$p_1$, between 7 and 17 with probability $p_2$, and between 11 and 13 with
probability $p_3$, where $p_1 = 14^{\alpha}/s$, $p_2 = 10^{\alpha}/s$ and
$p_3 = 2^{\alpha}/s$, with $s = 14^{\alpha} + 10^{\alpha} + 2^{\alpha}$. As
$p_1 + p_2 + p_3 = 1$, we can realise the choice of prime pair by calling a
random number between 0 and 1.
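
The construction described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names (`primes_up_to`, `goldbach_pairs`, `choose_pair`, `build_network`) are hypothetical, and decompositions with p = q are skipped on the assumption that a node is never linked to itself (this also avoids a zero difference, whose negative powers are undefined).

```python
import random


def primes_up_to(limit):
    """Return the set of all primes <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return set()
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return {i for i, flag in enumerate(is_prime) if flag}


def goldbach_pairs(n, primes):
    """All decompositions n = p + q with primes p < q (p = q skipped: assumption)."""
    return [(p, n - p) for p in range(2, n // 2 + 1)
            if p in primes and (n - p) in primes and p != n - p]


def choose_pair(n, alpha, primes, rng=random):
    """Pick one Goldbach pair of n with probability proportional to |p - q|**alpha."""
    pairs = goldbach_pairs(n, primes)
    weights = [float(q - p) ** alpha for p, q in pairs]
    # random.choices normalises the weights, which plays the role of dividing by s.
    return rng.choices(pairs, weights=weights, k=1)[0]


def build_network(n_max, alpha, seed=None):
    """Return the set of links obtained from the even numbers 8, 10, ..., n_max."""
    rng = random.Random(seed)
    primes = primes_up_to(n_max)
    links = set()
    for n in range(8, n_max + 1, 2):
        links.add(choose_pair(n, alpha, primes, rng))
    return links
```

For n = 24 and $\alpha = -2$, for instance, the weights are 1/196, 1/100 and 1/4, so the pair (11, 13) is chosen with probability of about 0.94.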

One should note that since we put one link for each even number, for M even
numbers one will have exactly M links, but some number N (< M) of nodes.
One should also note that through the parameter $\alpha$ we actually
control the difference between the primes chosen to be linked for each even
number. Thus, for $\alpha = 0$ our choice is independent of the difference
$\Delta$, while for $\alpha = -\infty$ ($+\infty$) the break-up with the
smallest (largest) difference between the components is chosen.

What type of network have we thus constructed? To answer this analytically,
let us calculate the average value of $\Delta$ for a given even number n.
This is

\[
\langle \Delta \rangle = \frac{1}{\Omega} \sum \Delta^{\alpha + 1},
\]

where the sum extends over all Goldbach pairs of n and $\Omega$ is the
number of such pairs. For positive values of the power $(\alpha + 1)$, the
value of this sum will be dominated by the terms with large $\Delta$, and
$\langle \Delta \rangle$ will hence increase as n increases. One member of
the chosen Goldbach pair will then be small, and these small numbers will
be highly populated nodes. The network may therefore be expected to be of
small-world type. On the other hand, for negative values of $(\alpha + 1)$,
the value of the sum will be dominated by small values of $\Delta$, and for
large n the quantity $\langle \Delta \rangle$ will converge to some finite
number. Both members of the chosen Goldbach pair will be large (so that the
difference between them remains small) and no node will be very highly
populated. The network is then likely to lose the small-world character.
Although $\Delta$ runs over some (but not all) of the natural numbers, the
convergence behaviour of $\langle \Delta \rangle$ may be expected to be the
same as that of the Riemann zeta function $\zeta(-\alpha - 1)$. As this
function converges only for $\alpha < -2$, the change-over in the behaviour
of the network is expected to occur around $\alpha = -2$.
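
The growth or saturation of this average with n can be explored numerically. The sketch below evaluates the sum of the (alpha + 1)-th powers of the differences, divided by the number of Goldbach pairs, exactly as the expression is written in the text. The names `is_prime` and `average_delta` are hypothetical, and pairs with p = q are excluded by assumption.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small illustrative n."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def average_delta(n, alpha):
    """(1/Omega) * sum of Delta**(alpha + 1) over the Goldbach pairs of even n."""
    # p runs over 3 .. n/2 - 1, so that p < q and Delta = q - p = n - 2p > 0.
    deltas = [n - 2 * p for p in range(3, n // 2)
              if is_prime(p) and is_prime(n - p)]
    return sum(float(d) ** (alpha + 1) for d in deltas) / len(deltas)


if __name__ == "__main__":
    # Compare a positive and two negative values of (alpha + 1) as n grows.
    for alpha in (1.0, -1.5, -3.0):
        print(alpha, [average_delta(n, alpha) for n in (100, 1000, 10000)])
```

Comparing the printed values for increasing n gives a rough, purely illustrative picture of the two regimes discussed above; it is not a substitute for the convergence argument.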

III. SIMULATION STUDIES 

To analyse the properties of the network by computer simulation, we
measure several characteristics of the network.

(i) Average shortest distance between two nodes (d): The shortest distance
between two nodes is the smallest number of links via which one can go from
one node to the other. We have measured this quantity for all pairs of
nodes and taken the average. The results are presented in Fig. 1 as a
function of the number of nodes (N). It is observed that this quantity
varies linearly with the logarithm of the number of nodes, as happens for a
small-world network. This behaviour prevails for all values of $\alpha$
down to a lower limit of $\alpha_0 = -1.8$. In particular, as $\alpha$
varies from 5 to 1, the d - N (log-linear) plot moves upwards parallel to
itself. As $\alpha$ decreases further, the lines continue to move upwards
but the slope increases continuously until, for $\alpha < \alpha_0$, the
line ceases to be straight and starts bending upwards. In this region, d
varies linearly with N, as happens for a regular network.

FIG. 1. Average shortest distance (d) as a function of the number of nodes
for the network constructed from prime numbers (continuous line). Also
shown is the same plot for a random network having the same number of nodes
(d', dotted line). The lines for d and d' cross over at $\alpha = 1$ and
N = 1000. The numbers labelling the curves stand for the value of $\alpha$.
All results presented in this paper have been averaged over about 20
realisations of the network.

FIG. 2. Plot of p(j) as a function of j, where p(j) is the probability that
a pair of nodes chosen randomly from the network will be a distance j
apart. The numbers labelling the curves stand for the value of $\alpha$.
N = 5000.

We have also plotted in Fig. 1 the average shortest distance (d') for a
random network having the same number of nodes. As is well known [1], the
d' - N (log-linear) plot is a straight line for the entire range of values
of $\alpha$, and the lines maintain a constant slope and move gradually
upwards as $\alpha$ increases. (However, for $\alpha > 2$ the d' - N plot
does not depend much on the value of $\alpha$.)

FIG. 3. Plot of average shortest distance d as a function of $\alpha$ for
different values of the number of nodes N. The numbers labelling the curves
stand for the value of N.

To gain further insight into the change of behaviour of d as a function of
$\alpha$, we have measured (Fig. 2) the probability p(j) that a pair of
nodes chosen randomly from the network will be a distance j apart. It is
observed that for $\alpha > \alpha_0$, p(j) is high only for small values
of j, indicating that most of the pairs of nodes are a small distance apart
and d (which is nothing but $\sum_j j\,p(j)$) is small. On the other hand,
for $\alpha < \alpha_0$, the distribution p(j) is flat over a large range
of values of j, indicating that the distance between a pair of nodes will
also often be large and d will be high due to the contribution from high
j-values.
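
A minimal sketch of how p(j) and d might be measured on an undirected network is given below. It uses a breadth-first search from every node; the function names and the adjacency-dictionary input format are assumptions of this sketch, as is the choice to ignore pairs of nodes with no connecting path.

```python
from collections import Counter, deque


def distance_distribution(adj):
    """Return {j: p(j)} over all connected pairs of nodes.

    `adj` maps each node to the set of its neighbours (undirected graph).
    Pairs with no connecting path are ignored (an assumption of this sketch).
    """
    counts = Counter()
    for source in adj:
        # Breadth-first search gives shortest distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, j in dist.items():
            if target != source:
                # Each unordered pair is counted twice; the ratios are unaffected.
                counts[j] += 1
    total = sum(counts.values())
    return {j: c / total for j, c in sorted(counts.items())}


def average_distance(adj):
    """Average shortest distance d, computed as the sum over j of j * p(j)."""
    return sum(j * p for j, p in distance_distribution(adj).items())
```

For a chain of three nodes, two of the three pairs are at distance 1 and one pair is at distance 2, so that d = 4/3.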

Lastly, in order to ascertain how sharply the change of behaviour occurs at
$\alpha = \alpha_0$, we plot d against $\alpha$ for different values of N
(Fig. 3). The rise of d for $\alpha < \alpha_0$ becomes sharper and sharper
as N increases. Our estimate of $\alpha_0 = -1.8$ is based on simulations
of networks with at most 5000 nodes.

(ii) Clustering coefficient (C): The clustering coefficient $C_i$ for the
node i is defined as the ratio $M_i/m_i$, where $M_i$ is the actual number
of links among the neighbours of the node i, and $m_i$ is the number of all
possible links among the neighbours of i. (Thus, if i has degree $k_i$,
then $m_i = k_i(k_i - 1)/2$.) The clustering coefficient for the entire
network is defined as the average of $C_i$ over all nodes i. This quantity
C has been measured in our network and compared with the same quantity (C')
for a random network with the same number of nodes (Fig. 4). It is found
that C > C' for the entire range of values of $\alpha$ investigated here.
This behaviour, combined with the results for d, leads us to conclude that
the network is of small-world type for $\alpha > \alpha_0$ and of regular
type for $\alpha < \alpha_0$.
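
The definition of the clustering coefficient can be illustrated with the short sketch below. It takes the number of possible links among the k neighbours of a node to be the standard pair count k(k - 1)/2, and it assigns zero to nodes with fewer than two neighbours; that convention, the function names and the adjacency-dictionary input are assumptions of this sketch.

```python
from itertools import combinations


def local_clustering(adj, i):
    """C_i = (links among the neighbours of i) / (k_i * (k_i - 1) / 2)."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention assumed here for nodes with fewer than two neighbours
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def clustering_coefficient(adj):
    """Network clustering coefficient C: the average of C_i over all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)
```

As a check, a triangle with one extra node attached to a single corner has C_i values of 1/3, 1, 1 and 0, giving C = 7/12.
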

FIG. 4. Clustering coefficient (C) as a function of the number of nodes (N)
for the network constructed from prime numbers (continuous line). Also
shown is the clustering coefficient C' (dotted line) for a random network
having the same number of nodes. For the entire range of $\alpha$ values C
remains larger than C'. The numbers labelling the curves stand for the
value of $\alpha$.

The details of the behaviour of the clustering coefficient as a function of
the number of nodes are as follows. For a given $\alpha$, C decays
algebraically with N, and the C - N line rises, maintaining a constant
slope, as $\alpha$ is increased from -0.5. But as $\alpha$ is decreased
from -0.5, the line remains straight and rises, but becomes more and more
horizontal. The slope almost vanishes (particularly on a log-log scale) for
$\alpha < \alpha_0$. For the corresponding random network, the clustering
coefficient C' also decreases algebraically with N, and the C' - N line
rises, maintaining a constant slope, as $\alpha$ is decreased from 0 to
$\alpha_0$. The lines for $\alpha < \alpha_0$ are almost coincident with
those for $\alpha = \alpha_0$, and the lines for $\alpha > 0$ are also
almost coincident with those for $\alpha = 0$.

(iii) The degree distribution function P(k) (defined as the probability
that a node has k links attached to it) is of irregular type (Fig. 5) for
all reasonable values of $\alpha$, indicating that the network is not of
scale-free nature. However, some other related plots do contain
interesting information. Thus, it is interesting to observe how the network
grows for different values of $\alpha$ (Fig. 6). For large positive values
of $\alpha$, in breaking up an even number n one chooses only those pairs
of primes that are far apart. One of the chosen primes will therefore be a
small prime number, while the other will be one that is close to n. Very
often one will find that the latter prime has not yet been included in the
network, and thus a new node is added. The network then grows in size. On
the other hand, for large negative values of $\alpha$, the difference
between the primes chosen will be small. Each prime will then be close to
n/2 and will very often be found to be already present in the network. The
network will then grow slowly. Such behaviour has been confirmed by
simulation (Fig. 6).

FIG. 5. Degree distribution P(k) in the network constructed from prime
numbers. The different lines correspond to $\alpha$ = 2 (A), -0.1 (B),
-0.5 (C), -2 (D). N = 5000.

We have also studied the $\alpha$ dependence (Fig. 7) of the average
connectivity $\langle k \rangle$ (which is simply $\sum_k k P(k) = 2M/N$,
M being the number of links and N the number of nodes) and of the
fluctuation in connectivity, defined as

\[
f(k) = \sqrt{\langle k^2 \rangle - \langle k \rangle^2}.
\]

Both these quantities display a sharp change at $\alpha = \alpha_0$. For a
given number of nodes, the value of $\langle k \rangle$ will be large for
$\alpha < \alpha_0$ and small for $\alpha > \alpha_0$, since there are more
links in the former case than in the latter. For a regular network, the
nodes have nearly the same degree, so that f(k) is small, but for a
small-world network some nodes are very rich in degree while the other
nodes have low degree, rendering f(k) very large. Moreover, for a
small-world network ($\alpha > \alpha_0$) the rich nodes go on gaining
links as the network evolves, so that the degree of the most-connected node
($k_m$, say) increases with the size of the network (Fig. 8). In the
regular network regime, no node is preferentially linked, and $k_m$ does
not increase very much with N and in fact approaches the average
connectivity $\langle k \rangle$.
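
The three degree-based quantities discussed here (the average connectivity, its fluctuation, and the degree of the most-connected node) can be computed from a list of links as in the sketch below. The function name and the input format, a collection of (p, q) pairs, are assumptions of this sketch.

```python
import math


def degree_statistics(links):
    """Return (<k>, f(k), k_m) for an undirected network given as (p, q) links."""
    degree = {}
    for p, q in links:
        degree[p] = degree.get(p, 0) + 1
        degree[q] = degree.get(q, 0) + 1
    n_nodes = len(degree)
    mean_k = sum(degree.values()) / n_nodes            # equals 2M/N
    mean_k2 = sum(k * k for k in degree.values()) / n_nodes
    # max(0.0, ...) guards against a tiny negative value caused by rounding.
    fluctuation = math.sqrt(max(0.0, mean_k2 - mean_k ** 2))
    return mean_k, fluctuation, max(degree.values())
```

For a star with one hub and three leaves, the degrees are 3, 1, 1, 1, so the average connectivity is 1.5, the fluctuation is the square root of 0.75, and the most-connected node has degree 3.
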

FIG. 6. Growth of the prime number network for different values of
$\alpha$. The number of nodes (N) is plotted as a function of the number of
links (M). Since one link is added at each time step, the X-axis also
represents the time step. The numbers labelling the curves stand for the
value of $\alpha$.




FIG. 7. Average connectivity $\langle k \rangle$ and fluctuation in
connectivity f(k) as a function of $\alpha$. N = 5000.

FIG. 8. Average connectivity $\langle k \rangle$ (broken line) and
connectivity of the best connected node $k_m$ (continuous line) as a
function of the number of nodes (N) for different values of $\alpha$. As
$\alpha$ decreases, the $\langle k \rangle$ - N line rises, while the
$k_m$ - N line descends. The numbers labelling the curves stand for the
value of $\alpha$.

FIG. 9. Clustering coefficient C(k) measured as a function of the degree.
The numbers labelling the curves stand for the value of $\alpha$. For
$\alpha < -1$ the curve looks similar to that for $\alpha = -1$ but goes
higher up, and for $\alpha > 2$ the curve becomes flatter and moves further
down. N = 5000.



(iv) The clustering coefficient C(k) measured as a function of the degree
of a node has proved to be an important characteristic of many real
networks [1,4]. This quantity is defined as $C_i$ (as defined above)
averaged by running i over only those nodes that have degree k. For
$\alpha < 0$, this quantity has a peak at a small (< 20) value of k,
indicating that the nodes mostly have low degree and high clustering
coefficient. On the other hand, for $\alpha > 0$, C(k) is almost flat,
extending over a large range of values of k (Fig. 9).

(v) The degree-degree correlation function r, as proposed by Newman [5],
measures the tendency of a link to have the same type of degree (both high
or both low) at its two ends. Thus, when r is positive, the network is
assortative, and a link prefers to have the same type of node at its two
ends, whereas when r is negative, the network is disassortative, and a link
prefers to have different types of nodes (one of high degree and the other
of low degree) at its two ends. This parameter may be measured from the
relation [5]

\[
r = \frac{M^{-1} \sum_i j_i k_i - \left[ M^{-1} \sum_i \tfrac{1}{2} (j_i + k_i) \right]^2}
         {M^{-1} \sum_i \tfrac{1}{2} (j_i^2 + k_i^2) - \left[ M^{-1} \sum_i \tfrac{1}{2} (j_i + k_i) \right]^2},
\]

where $j_i$ and $k_i$ are the degrees of the nodes at the two ends of the
i-th link and M is the number of links. For the network under study, the
quantity r has been found to be positive (negative) for negative (positive)
values of $\alpha$ (Fig. 10). This indicates that the nature of the network
changes from assortative to disassortative as $\alpha$ changes its sign.
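
A sketch of how r might be evaluated from a list of links is given below. It assumes the commonly quoted edge-based form of Newman's coefficient: the mean over links of the product of the end degrees, minus the square of the mean of their half-sum, divided by the mean of the half-sum of their squares minus the same squared mean. The function name and the input format are hypothetical.

```python
def degree_correlation(links):
    """Newman's degree-degree correlation coefficient r for an undirected network."""
    links = list(links)
    degree = {}
    for a, b in links:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    m = len(links)
    # Averages over links of j*k, (j + k)/2 and (j^2 + k^2)/2 for end degrees j, k.
    mean_jk = sum(degree[a] * degree[b] for a, b in links) / m
    mean_half_sum = sum(0.5 * (degree[a] + degree[b]) for a, b in links) / m
    mean_half_sq = sum(0.5 * (degree[a] ** 2 + degree[b] ** 2) for a, b in links) / m
    denominator = mean_half_sq - mean_half_sum ** 2
    if denominator == 0:
        raise ValueError("r is undefined when all link ends have the same degree")
    return (mean_jk - mean_half_sum ** 2) / denominator
```

A star, in which every link joins the hub to a node of degree 1, gives r = -1, the extreme disassortative case.
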

FIG. 10. Degree-degree correlation coefficient r as a function of $\alpha$.
Note that r bears a sign opposite to that of $\alpha$. N = 5000.



IV. CONCLUSION 

In conclusion, we have constructed a network of prime numbers with links
placed on the basis of the Goldbach conjecture. The network is of
small-world type when a parameter $\alpha$ of the model is larger than -1.8
and of regular type when $\alpha$ is lower than -1.8. One must note that,
for $\alpha > 0$, larger values of $\Delta$ (the difference between the
component primes) are preferred and the addition of a new link leads to a
large prime getting attached to a small one. Thus, preferential attachment
to small primes is supported and the "rich get richer" principle leads to a
small-world network, as in the case of the Barabási-Albert network [1],
although the scale-free property is not observed. In any case, the
small-world property indicates some pattern in the Goldbach decomposition
of even numbers into primes.